-
Notifications
You must be signed in to change notification settings - Fork 335
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor(dev/docker/hive): shrink hive Docker image size by 420MB #3268
refactor(dev/docker/hive): shrink hive Docker image size by 420MB #3268
Conversation
reopen for rerunning ci tests |
@mchades Would you like to review this PR ? |
@@ -83,6 +83,7 @@ public void startHiveContainer() { | |||
HiveContainer.Builder hiveBuilder = | |||
HiveContainer.builder() | |||
.withHostName("gravitino-ci-hive") | |||
.withImage("unknowntpo/gravitino-ci-hive:latest") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I use my hive image here, we need to upload newly created gravitino-ci-hive
image before merging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @unknowntpo , It's not good practice to use your image here, @xunliu Please help to verify if changes about the Hive image are reasonable, If everything is OK, we can release the image before this PR is merged, then @unknowntpo can change the image to the newly released one.
@unknowntpo Another point is that you may need to modify the related document about the Hive image, please see
https://github.com/datastrato/gravitino/blob/14297cddc894a4d47851ddfbcaec50bc547e0387/docs/docker-image-details.md?plain=1#L87C1-L88C1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, if @xunliu find this modification reasonable and push new image to docker hub, I'll update the changelog in docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've add some todos in the PR description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @unknowntpo , It's not good practice to use your image here, @xunliu Please help to verify if changes about the Hive image are reasonable, If everything is OK, we can release the image before this PR is merged, then @unknowntpo can change the image to the newly released one.
@unknowntpo Another point is that you may need to modify the related document about the Hive image, please see https://github.com/datastrato/gravitino/blob/14297cddc894a4d47851ddfbcaec50bc547e0387/docs/docker-image-details.md?plain=1#L87C1-L88C1
@yuqi1129 Can you help release this Docker image to hub?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @unknowntpo , It's not good practice to use your image here, @xunliu Please help to verify if changes about the Hive image are reasonable, If everything is OK, we can release the image before this PR is merged, then @unknowntpo can change the image to the newly released one.
@unknowntpo Another point is that you may need to modify the related document about the Hive image, please see14297cd
/docs/docker-image-details.md?plain=1#L87C1-L88C1@yuqi1129 Can you help release this Docker image to hub?
OK
aadf3c6
to
b4fd3a5
Compare
@@ -45,8 +64,6 @@ RUN apt-get update && apt-get upgrade -y && apt-get install --fix-missing -yq \ | |||
RUN mkdir /root/.ssh | |||
RUN cat /dev/zero | ssh-keygen -q -N "" > /dev/null && cat /root/.ssh/id_rsa.pub > /root/.ssh/authorized_keys | |||
|
|||
COPY packages /tmp/packages |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can use ADD
, if use ADD in docker file for a zip file, docker builder will copy and unzip it, it will reduce the image size.
then, seems we do not need two stage build.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for you advise! I used ADD
in the beginning, but I found that we need --strip-component
to remove outer directory, ADD
cannot do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, --strip-component
did not worked in ADD
, so I use a soft link in Doris Image
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I'll try the soft-link approach, thanks 😀
@@ -3,14 +3,33 @@ | |||
# This software is licensed under the Apache License version 2. | |||
# | |||
|
|||
FROM ubuntu:16.04 | |||
LABEL maintainer="support@datastrato.com" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need kept LABEL maintainer="support@datastrato.com"
in the here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since I don't use 2 stage build right now, this review is outdated.
dev/docker/hive/Dockerfile
Outdated
@@ -3,14 +3,33 @@ | |||
# This software is licensed under the Apache License version 2. | |||
# | |||
|
|||
FROM ubuntu:16.04 | |||
LABEL maintainer="support@datastrato.com" | |||
FROM ubuntu:16.04 AS packages |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think better change to FROM ubuntu:16.04 AS install-hadoop
, The packages
name ease confuse with /tmp/packages
directory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since I don't use 2 stage build right now, this is not needed.
dev/docker/hive/Dockerfile
Outdated
|
||
# hadoop | ||
RUN mkdir ${HADOOP_HOME} | ||
RUN tar -xz -C ${HADOOP_HOME} --strip-components 1 -f /tmp/packages/${HADOOP_PACKAGE_NAME} && rm -rf /tmp/packages/${HADOOP_PACKAGE_NAME} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we need use --strip-component
params?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same reason as #3268 (comment)
b4fd3a5
to
8c3ffa5
Compare
I use soft-link approach as @zhoukangcn mentioned at #3268 (comment), because we don't need to maintain lots of env var like |
hi @unknowntpo thank you for your continue work. |
@xunliu Shrinked hive Docker image: 1.85 GB (with soft-link) Original Docker image: 2.27 GB It's not 500 MB, but 420 MB 😭 |
hi @unknowntpo Please fix branch conflicts. Thanks |
4fc6112
to
38fde2d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@unknowntpo |
I have released gravitino-ci-hive:0.1.12. Please upgrade it to Gravitino from version 0.1.11 to 0.1.12. |
6b8a3e1
to
87acb60
Compare
87acb60
to
200c0e8
Compare
@yuqi1129 Done. |
I mean you need to change the following place: There should be several similar places. |
200c0e8
to
acf79aa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@yuqi1129 done, the tests has passed. |
) ### What changes were proposed in this pull request? Use multi-stage build to reduce hive image size from 2.27GB to 1.83GB. I use first stage to download archive file and unzip them, then in the final stage, copied them to destination directory. ### Why are the changes needed? Fix: #3262 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Unit tests. ## TODO - [ ] @mchades needs to push new image release to docker hub - [ ] @unknowntpo should update changelog
…ache#3268) ### What changes were proposed in this pull request? Use multi-stage build to reduce hive image size from 2.27GB to 1.83GB. I use first stage to download archive file and unzip them, then in the final stage, copied them to destination directory. ### Why are the changes needed? Fix: apache#3262 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Unit tests. ## TODO - [ ] @mchades needs to push new image release to docker hub - [ ] @unknowntpo should update changelog
What changes were proposed in this pull request?
Use multi-stage build to reduce hive image size from 2.27GB to 1.83GB.
I use first stage to download archive file and unzip them, then in the final stage, copied them to destination directory.
Why are the changes needed?
Fix: #3262
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Unit tests.
TODO